-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[flang] Optimize acospi
precision
#152869
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-llvm-support Author: Connector Switch (c8ef) ChangesFull diff: https://github.com/llvm/llvm-project/pull/152869.diff 3 Files Affected:
diff --git a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp
index bfa470dd5690d..3e6fbafe8a6b3 100644
--- a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+++ b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp
@@ -2687,10 +2687,10 @@ mlir::Value IntrinsicLibrary::genAcospi(mlir::Type resultType,
mlir::FunctionType ftype =
mlir::FunctionType::get(context, {resultType}, {args[0].getType()});
mlir::Value acos = getRuntimeCallGenerator("acos", ftype)(builder, loc, args);
- llvm::APFloat inv_pi = llvm::APFloat(llvm::numbers::inv_pi);
- mlir::Value dfactor =
- builder.createRealConstant(loc, mlir::Float64Type::get(context), inv_pi);
- mlir::Value factor = builder.createConvert(loc, resultType, dfactor);
+ llvm::APFloat inv_pi =
+ llvm::APFloat(llvm::cast<mlir::FloatType>(resultType).getFloatSemantics(),
+ llvm::numbers::inv_pis);
+ mlir::Value factor = builder.createRealConstant(loc, resultType, inv_pi);
return mlir::arith::MulFOp::create(builder, loc, acos, factor);
}
diff --git a/flang/test/Lower/Intrinsics/acospi.f90 b/flang/test/Lower/Intrinsics/acospi.f90
index dcacd25bca480..facb57abcdbcc 100644
--- a/flang/test/Lower/Intrinsics/acospi.f90
+++ b/flang/test/Lower/Intrinsics/acospi.f90
@@ -10,8 +10,7 @@ function test_real4(x)
! CHECK-LABEL: @_QPtest_real4
! CHECK-PRECISE: %[[acos:.*]] = fir.call @acosf({{%[A-Za-z0-9._]+}}) fastmath<contract> : (f32) -> f32
! CHECK-FAST: %[[acos:.*]] = math.acos %{{.*}} : f32
-! CHECK: %[[dpi:.*]] = arith.constant 0.31830988618379069 : f64
-! CHECK: %[[inv_pi:.*]] = fir.convert %[[dpi]] : (f64) -> f32
+! CHECK: %[[inv_pi:.*]] = arith.constant 0.318309873 : f32
! CHECK: %{{.*}} = arith.mulf %[[acos]], %[[inv_pi]] fastmath<contract> : f32
function test_real8(x)
diff --git a/llvm/include/llvm/Support/MathExtras.h b/llvm/include/llvm/Support/MathExtras.h
index 7bbf1b3aab761..851fe4829b675 100644
--- a/llvm/include/llvm/Support/MathExtras.h
+++ b/llvm/include/llvm/Support/MathExtras.h
@@ -74,6 +74,8 @@ constexpr float ef = 0x1.5bf0a8P+1F, // (2.71828183) https://oeis.org/A
sqrt3f = 0x1.bb67aeP+0F, // (1.73205081) https://oeis.org/A002194
inv_sqrt3f = 0x1.279a74P-1F, // (.577350269)
phif = 0x1.9e377aP+0F; // (1.61803399) https://oeis.org/A001622
+constexpr const char *pis = "3.141592653589793238462643383279502884",
+ *inv_pis = "0.318309886183790671537767526745028724";
// clang-format on
} // namespace numbers
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for coming back to this. Please could you add a test for f128 like in your issue.
I can do this, but I'm concerned it might break the AIX build bot. It seems they use a different fp format. We may need to add separate fp numbers based on the target triple to address this? (Or simply omit the f128 test... |
Try |
34920d8
to
bcdd26a
Compare
I'm actually referring to the following case: #include "llvm/ADT/APFloat.h"
#include "llvm/Support/raw_ostream.h"
int main() {
llvm::APFloat pi(llvm::APFloat::IEEEquad(),
"0.318309886183790671537767526745028724");
pi.print(llvm::errs());
llvm::APFloat pi_ppc(llvm::APFloat::PPCDoubleDouble(),
"0.318309886183790671537767526745028724");
pi_ppc.print(llvm::errs());
}
// 0.318309886183790671537767526745028737
// 0.31830988618379067153776752674503 They have different rounding, so an exact match is difficult. I chose to match the common prefix instead. It should work on all build bots now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for being so careful not to break the build bots.
bcdd26a
to
5920e94
Compare
Merge activity
|
1080d68
to
ed4ff68
Compare
ed4ff68
to
36f6e4b
Compare
This patch implements the solution for
acospi
proposed in #150452, improving precision for r16 and removing a redundant cast for r4.